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Abstract 


The main purpose of this note is to clarify the current 
interpretation of the 32-bit IP version 4 address space, whose 
significance has changed substantially since it was originally 
defined. A short section on IPv6é addresses mentions the main points 
of similarity with, and difference from, IPv4. 
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1. Introduction 


The main purpose of this note is to clarify the current 
interpretation of the 32-bit IP version 4 address space, whose 
significance has changed substantially since it was originally 
defined in 1981 [RFC 791]. 


Carpenter, et. al. Informational [Page 1] 


RFC 2101 IPv4 Address Behavior Today February 1997 


This clarification is intended to assist protocol designers, product 
implementors, Internet service providers, and user sites. It aims to 
avoid misunderstandings about IP addresses that can result from the 
substantial changes that have taken place in the last few years, as a 
result of the Internet’s exponential growth. 


A short section on IPv6 addresses mentions the main points of 
similarity with, and difference from, IPv4. 


2. Terminology 


It is well understood that in computer networks, the concepts of 
directories, names, network addresses, and routes are separate and 
must be analysed separately [RFC 1498]. However, it is also 
necessary to sub-divide the concept of "network address" (abbreviated 
to "address" from here on) into at least two notions, namely 
"identifier" and "locator". This was perhaps less well understood 
when RFC 791 was written. 


In this document, the term "host" refers to any system originating 
and/or terminating IPv4 packets, and "router" refers to any system 
forwarding IPv4 packets from one host or router to another. 


For the purposes of this document, an "identifier" is a bit string 
which is used throughout the lifetime of a communication session 
between two hosts, to identify one of the hosts as far as the other 
is concerned. Such an identifier is used to verify the source of 
incoming packets as being truly the other end of the communication 
concerned, e.g. in the TCP pseudo-header [RFC 793] or in an IP 
Security association [RFC 1825]. Traditionally, the source IPv4 
address in every packet is used for this. 


Note that other definitions of "identifier" are sometimes used; this 
document does not claim to discuss the general issue of the semantics 
of end-point identifiers. 


For the purposes of this document, a "locator" is a bit string which 
is used to identify where a particular packet must be delivered, i.e. 
it serves to locate the place in the Internet topology where the 
destination host is attached. Traditionally, the destination IPv4 
address in every packet is used for this. IP routing protocols 
interpret IPv4 addresses as locators and construct routing tables 
based on which routers (which have their own locators) claim to know 
a route towards the locators of particular hosts. 


Both identifiers and locators have requirements of uniqueness, but 


these requirements are different. Identifiers must be unique with 
respect to each set of inter-communicating hosts. Locators must be 
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unique with respect to each set of inter-communicating routers (which 
we will call a routing "realm"). While locators must be unique within 
a given routing realm, this uniqueness (but not routability) could 
extend to more than one realm. Thus we can further distinguish 
between a set of realms with unique locators versus a set of realms 
with non-unique (overlapping) locators. 


Both identifiers and locators have requirements of lifetime, but 
these requirements are different. Identifiers must be valid for at 
least the maximum lifetime of a communication between two hosts. 
Locators must be valid only as long as the routing mechanisms so 
require (which could be shorter or longer than the lifetime of a 
communication). 


It will be noted that it is a contingent fact of history that the 
same address space and the same fields in the IP header (source and 
destination addresses) are used by RFC 791 and RFC 793 for both 
identifiers and locators, and that in the traditional Internet a 
host’s identifier is identical to its locator, as well as being 
spatially unique (unambiguous) and temporally unique (constant). 


These uniqueness conditions had a number of consequences for design 
assumptions of routing (the infrastructure that IPv4 locators enable) 
and transport protocols (that which depends on the IP connectivity). 
Spatial uniqueness of an address meant that it served as both an 
interface identifier and a host identifier, as well as the key to the 
routing table. Temporal uniqueness of an address meant that there 
was no need for TCP implementations to maintain state regarding 
identity of the far end, other than the IP address. Thus IP addresses 
could be used both for end-to-end IP security and for binding upper 
layer sessions. 


Generally speaking, the use of IPv4 addresses as locators has been 
considered more important than their use as identifiers, and whenever 
there has been a conflict between the two uses, the use as a locator 
has prevailed. That is, it has been considered more useful to deliver 
the packet, then worry about how to identify the end points, than to 
provide identity in a packet that cannot be delivered. In other 
words, there has been intensive work on routing protocols and little 
concrete work on other aspects of address usage. 


3. Ideal properties. 


Whatever the constraints mentioned above, it is easy to see the ideal 
properties of identifiers and locators. Identifiers should be 
assigned at birth, never change, and never be re-used. Locators 
should describe the host’s position in the network’s topology, and 
should change whenever the topology changes. 
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Unfortunately neither of the these ideals are met by IPv4 addresses. 
The remainder of this document is intended as a snapshot of the 
current real situation. 


4. Overview of the current situation of IPv4 addresses. 


It is a fact that IPv4 addresses are no longer all globally unique 
and no longer all have indefinite lifetimes. 


4.1 Addresses are no longer globally unique locators 


[RFC 1918] shows how corporate networks, a.k.a. Intranets, may if 
necessary legitimately re-use a subset of the IPv4 address space, 
forming multiple routing realms. At the boundary between two (or 
more) routing realms, we may find a spectrum of devices that 
enables communication between the realms. 


At one end of the spectrum is a pure Application Layer Gateway 
(ALG). Such a device acts as a termination point for the 
application layer data stream, and is visible to an end-user. For 
example, when an end-user Ua in routing realm A wants to 
communicate with an end-user Ub in routing realm B, Ua has first 
to explicitly establish communication with the ALG that 
interconnects A and B, and only via that can Ua establish 
communication with Ub. We term such a gateway a "non-transparent" 
ALG. 


Another form of ALG makes communication through the ALG 
transparent to an end user. Using the previous example, with a 
"transparent" ALG, Ua would not be required to establish explicit 
connectivity to the ALG first, before starting to communicate with 
Ub. Such connectivity will be established transparently to Ua, so 
that Ua would only see connectivity to Ub. 


For completeness, note that it is not necessarily the case that 
communicating via an ALG involves changes to the network header. 
An ALG could be used only at the beginning of a session for the 
purpose of authentication, after which the ALG goes away and 
communication continues natively. 


Both non-transparent and transparent ALGs are required (by 
definition) to understand the syntax and semantics of the 
application data stream. ALGs are very simple from the viewpoint 
of network layer architecture, since they appear as Internet hosts 
in each realm, i.e. they act as origination and termination points 
for communication. 
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At the other end of the spectrum is a Network Address Translator 
(NAT) [RFC 1631]. In the context of this document we define a NAT 
as a device that just modifies the network and the transport layer 
headers, but does not understand the syntax/semantics of the 
application layer data stream (using our terminology what is 
described in RFC1631 is a device that has both the NAT and ALG 
functionality). 


In the standard case of a NAT placed between a corporate network 
using private addresses [RFC 1918] and the public Internet, that 
NAT changes the source IPv4 address in packets going towards the 
Internet, and changes the destination IPv4 address in packets 
coming from the Internet. When a NAT is used to interconnect 
routing realms with overlapping addresses, such as a direct 
connection between two intranets, the NAT may modify both 
addresses in the IP header. Since the NAT modifies address(es) in 
the IP header, the NAT also has to modify the transport (e.g., 
TCP, UDP) pseudo-header checksum. Upon some introspection one 
could observe that when interconnecting routing realms with 
overlapping addresses, the set of operations on the network and 
transport header performed by a NAT forms a (proper) subset of the 
set of operations on the network and transport layer performed by 
a transparent ALG. 


By definition a NAT does not understand syntax and semantics of an 
application data stream. Therefore, a NAT cannot support 
applications that carry IP addresses at the application layer 
(e.g., FTP with PORT or PASV command [RFC 959]). On the other 
hand, a NAT can support any application, as long as such an 
application does not carry IP addresses at the application layer. 
This is in contrast with an ALG that can support only the 
applications coded into the ALG. 


One can conclude that both NATs and ALGs have their own 
limitations, which could constrain their usefulness. Combining NAT 
and ALG functionality in a single device could be used to overcome 
some, but not all, of these limitations. Such a device would use 
the NAT functionality for the applications that do not carry IP 
addresses, and would resort to the ALG functionality when dealing 
with the applications that carry IP addresses. For example, such a 
device would use the NAT functionality to deal with the FTP data 
connection, but would use the ALG functionality to deal with the 
FTP control connection. However, such a device will fail 
completely handling an application that carries IP addresses, when 
the device does not support the application via the ALG 
functionality, but rather handles it via the NAT functionality. 
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Communicating through either ALGs or NATs involves changes to the 
network header (and specifically source and destination 
addresses), and to the transport header. Since IP Security 
authentication headers assume that the addresses in the network 
header are preserved end-to-end, it is not clear how one could 
support IP Security-based authentication between a pair of hosts 
communicating through either an ALG or a NAT. Since IP Security, 
when used for confidentiality, encrypts the entire transport layer 
end-to-end, it is not clear how an ALG or NAT could modify 
encrypted packets as they require to. In other words, both ALGs 
and NATs are likely to force a boundary between two distinct IP 
Security domains, both for authentication and for confidentiality, 
unless specific enhancements to IP Security are designed for this 
purpose. 


Interconnecting routing realms via either ALGs or NATs relies on 
the DNS [RFC 1035]. Specifically, for a given set of 
(interconnected) routing realms, even if network layer addresses 
are no longer unique across the set, fully qualified domain names 
would need to be unique across the set. However, a site that is 
running a NAT or ALG probably needs to run two DNS servers, one 
inside and one outside the NAT or ALG, giving different answers to 
identical queries. This is discussed further in [kre]. DNS 
security [RFC 2065] and some dynamic DNS updates [dns2] will 
presumably not be valid across a NAT/ALG boundary, so we must 
assume that the external DNS server acquires at least part of its 
tables by some other mechanism. 


To summarize, since RFC 1918, we have not really changed the 
spatial uniqueness of an address, so much as recognized that there 
are multiple spaces. i.e. each space is still a routing realm 
such as an intranet, possibly connected to other intranets, or the 
Internet, by NATs or ALGs (see above discussion). The temporal 
uniqueness of an address is unchanged by RFC 1918. 


4.2. Addresses are no longer all temporally unique 


Note that as soon as address significance changes anywhere in the 
address space, it has in some sense changed everywhere. This has 
in fact already happened. 


IPv4 address blocks were for many years assigned chronologically, 
i.e. effectively at random with respect to network topology. 
This led to constantly growing routing tables; this does not 
scale. Today, hierarchical routing (CIDR [RFC 1518], [RFC 1519]) 
is used as a mechanism to improve scaling of routing within a 
routing realm, and especially within the Internet (The Annex goes 
into more details on CIDR). 
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Scaling capabilities of CIDR are based on the assumption that 
address allocation reflects network topology as much as possible, 
and boundaries for aggregation of addressing information are not 
required to be fully contained within a single organization - they 
may span multiple organizations (e.g., provider with its 
subscribers). Thus if a subscriber changes its provider, then to 
avoid injecting additional overhead in the Internet routing 
system, the subscriber may need to renumber. 


Changing providers is just one possible reason for renumbering. 
The informational document [RFC 1900] shows why renumbering is an 
increasingly frequent event. Both DHCP [RFC 1541] and PPP [RFC 
1661] promote the use of dynamic address allocation. 


To summarize, since the development and deployment of DHCP and 
PPP, and since it is expected that renumbering is likely to become 
a common event, IP address significance has indeed been changed. 
Spatial uniqueness should be the same, so addresses are still 
effective locators. Temporal uniqueness is no longer assured. It 
may be quite short, possibly shorter than a TCP connection time. 
In such cases an IP address is no longer a good identifier. This 
has some impact on end-to-end security, and breaks TCP in its 
current form. 


4.3. Multicast and Anycast 


Since we deployed multicast [RFC 1112], we must separate the 
debate over meaning of IP addresses into meaning of source and 
destination addresses. A destination multicast address (i.e. a 
locator for a topologically spread group of hosts) can traverse a 
NAT, and is not necessarily restricted to an intranet (or to the 
public Internet). Its lifetime can be short too. 


The concept of an anycast address is of an address that 
semantically locates any of a group of systems performing 
equivalent functions. There is no way such an address can be 
anything but a locator; it can never serve as an identifier as 
defined in this document, since it does not uniquely identify 
host. In this case, the effective temporal uniqueness, or useful 
lifetime, of an IP address can be less than the time taken to 
establish a TCP connection. 


Here we have used TCP simply to illustrate the idea of an 
association - many UDP based applications (or other systems 
layered on IP) allocate state after receiving or sending a first 
packet, based on the source and/or destination. All are affected 
by absence of temporal uniqueness whereas only the routing 
infrastructure is affected by spatial uniqueness changes. 
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4.4. Summary 


Due to dynamic address allocation and increasingly frequent 
network renumbering, temporal uniqueness of IPv4 addresses is no 
longer globally guaranteed, which puts their use as identifiers 
into severe question. Due to the proliferation of Intranets, 
spatial uniqueness is also no longer guaranteed across routing 
realms; interconnecting routing realms could be accomplished via 
either ALGs or NATs. In principle such interconnection will have 
less functionality than if those Intranets were directly 
connected. In practice the difference in functionality may or may 
not matter, depending on individual circumstances. 


5. IPv6 Considerations 


As far as temporal uniqueness (identifier-like behaviour) is 
concerned, the IPv6 model [RFC 1884] is very similar to the current 
state of the IPv4 model, only more so. IPv6 will provide mechanisms 
to autoconfigure IPv6 addresses on IPv6 hosts. Prefix changes, 
requiring the global IPv6 addresses of all hosts under a given prefix 
to change, are to be expected. Thus, IPv6 will amplify the existing 
problem of finding stable identifiers to be used for end-to-end 
security and for session bindings such as TCP state. 


The IAB feels that this is unfortunate, and that the transition to 
IPv6 would be an ideal occasion to provide upper layer end-to-end 
protocols with temporally unique identifiers. The exact nature of 
these identifiers requires further study. 


As far as spatial uniqueness (locator-like behaviour) is concerned, 
the IPv6 address space is so big that a shortage of addresses, 
requiring an RFC 1918-like approach and address translation, is 
hardly conceivable. Although there is no shortage of IPv6 addresses, 
there is also a well-defined mechanism for obtaining link-local and 
site-local addresses in IPv6 [RFC 1884, section 2.4.8]. These 
properties of IPv6 do not prevent separate routing realms for IPv6, 
if so desired (resulting in multiple security domains as well). 

While at the present moment we cannot identify a case in which 
multiple IPv6é routing realms would be required, it is also hard to 
give a definitive answer to whether there will be only one, or more 
than one IPv6 routing realms. If one hypothesises that there will be 
more than one IPv6 routing realm, then such realms could be 
interconnected together via ALGs and NATs. Considerations for such 
ALGs and NATsS appear to be identical to those for IPv4. 


Carpenter, et. al. Informational [Page 8] 


RFC 2101 IPv4 Address Behavior Today February 1997 


ANNEX: Current Practices for IPv4 Address Allocation & Routing 


Initially IP address structure and IP routing were designed around 
the notion of network number classes (Class A/B/C networks) [RFC 
790]. In the earlier 90s growth of the Internet demanded significant 
improvements in both the scalability of the Internet routing system, 
as well as in the IP address space utilization. Classful structure 
of IP address space and associated with it classful routing turned 
out to be inadequate to meet the demands, so during 1992 - 1993 
period the Internet adopted Classless Inter-Domain Routing (CIDR) 
[RFC 1380], [RFC 1518], [RFC 1519]. CIDR encompasses a new address 
allocation architecture, new routing protocols, anda new structure 
of IP addresses. 


CIDR improves scalability of the Internet routing system by extending 
the notion of hierarchical routing beyond the level of individual 
subnets and networks, to allow routing information aggregation not 
only at the level of individual subnets and networks, but at the 
level of individual sites, as well as at the level of Internet 
Service Providers. Thus an organization (site) could act as an 
aggregator for all the destinations within the organization. 
Likewise, a provider could act as an aggregator for all the 
destinations within its subscribers (organizations directly connected 
to the provider). 


Extending the notion of hierarchical routing to the level of 
individual sites and providers, and allowing sites and providers to 
act as aggregators of routing information, required changes both to 
the address allocation procedures, and to the routing protocols. 
While in pre-CIDR days address allocation to sites was done without 
taking into consideration the need to aggregate the addressing 
information above the level of an individual network numbers, CIDR- 
based allocation recommends that address allocation be done in such 
a way as to enable sites and providers to act as aggregators of 
addressing information - such allocation is called "aggregator 
based". To benefit from the "aggregator based" address allocation, 
CIDR introduces an inter-domain routing protocol (BGP-4) [RFC 1771, 
RFC 1772] that provides capabilities for routing information 
aggregation at the level of individual sites and providers. 


CIDR improves address space utilization by eliminating the notion of 
network classes, and replacing it with the notion of contiguous 
variable size (power of 2) address blocks. This allows a better match 
between the amount of address space requested and the amount of 
address space allocated [RFC 1466]. It also facilitates "aggregator 
based" address allocation. Eliminating the notion of network classes 
requires new capabilities in the routing protocols (both intra and 
inter-domain), and IP forwarding. Specifically, the CIDR-capable 
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protocols are required to handle reachability (addressing) 
information expressed in terms of variable length address prefixes, 
and forwarding is required to implement the "longest match" 
algorithm. CIDR implications on routing protocols are described in 
[RFC 1817]. 


The scaling capabilities of CIDR are based on the assumption that 
address allocation reflects network topology as much as possible, 
especially at the level of sites, and their interconnection with 
providers, to enable sites and providers to act as aggregators. If a 
site changes its provider, then to avoid injecting additional 
overhead in the Internet routing system, the site may need to 
renumber. While CIDR does not require every site that changes its 
providers to renumber, it is important to stress that if none of the 
sites that change their providers will renumber, the Internet routing 
system might collapse due to the excessive amount of routing 
information it would need to handle. 


Maintaining "aggregator based" address allocation (to promote 
scalable routing), and the need to support the ability of sites to 
change their providers (to promote competition) demands practical 
solutions for renumbering sites. The need to contain the overhead 
in a rapidly growing Internet routing system is likely to make 
renumbering more and more common [RFC 1900]. 


The need to scale the Internet routing system, and the use of CIDR as 
the primary mechanism for scaling, results in the evolution of 
address allocation and management policies for the Internet. This 
evolution results in adding the "address lending" policy as an 
alternative to the "address ownership" policy [RFC 2008]. 


IP addressing and routing have been in constant evolution since IP 
was first specified [RFC 791]. Some of the addressing and routing 
principles have been deprecated, some of the principles have been 
preserved, while new principles have been introduced. Current 
Internet routing and addresses (based on CIDR) is an evolutionary 
step that extends the use of hierarchy to maintain a routable global 
Internet. 


Security Considerations 


The impact of the IP addressing model on security is discussed in 
sections 4.1 and 5 of this document. 
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